Shaping in Practice: Training Wheels to Learn Fast Hopping Directly in Hardware

Authors

  • Steve Heim
  • Felix Ruppert
  • Alborz A. Sarvestani
  • Alexander Spröwitz
Abstract

Learning robot controllers instead of designing them can greatly reduce the engineering effort required, while also emphasizing robustness. Despite considerable progress in simulation, applying learning directly in hardware remains challenging, in part because of the need to explore potentially unstable parameters. We explore the concept of shaping the reward landscape with training wheels: temporary modifications of the physical hardware that facilitate learning. We demonstrate the concept with a robot leg mounted on a boom learning to hop fast. This proof of concept embodies typical challenges such as instability and contact, while being simple enough to empirically map out and visualize the reward landscape. Based on our results, we propose three criteria for designing effective training wheels for learning in robotics. A video synopsis can be found at https://youtu.be/6iH5E3LrYh8.

Similar resources

Shaping in Reinforcement Learning by Changing the Physics of the Problem

Children learn to ride a bicycle by using training wheels. They are actually trying to learn one task (riding without training wheels) by training on another one. In general, solving a difficult problem can be facilitated by training on other problems. This is the basic idea of shaping. It is essential to ensure that spending time on the modified task will help solve the original one. In this paper...

Comparison of the Effect of 6 Weeks of Balancing and Hopping Strengthening Training on the Kinematics of the Lower Extremities of Athletes with Functional Ankle Instability while Running: A Randomized Controlled Trial

Introduction: Ankle sprains are one of the most common sports injuries. This injury can affect the kinematics of the athlete's lower extremities. Therefore, the aim of this study was to compare the effect of 6 weeks of balancing and hopping strengthening training on the kinematics of the lower extremities of athletes with functional ankle instability while running. Methods: The present...

ROCK∗ - Efficient black-box optimization for policy learning

Robotic learning on real hardware requires an efficient algorithm which minimizes the number of trials needed to learn an optimal policy. Prolonged use of hardware causes wear and tear on the system and demands more attention from an operator. To this end, we present a novel black-box optimization algorithm, Reward Optimization with Compact Kernels and fast natural gradient regression (ROCK). O...

Effect of six week hopping exercise on time to stabilization and perceived instability of athlete with chronic ankle instability during single leg jump landing

Introduction: The purpose of this study was to examine the effect of a 6-week hopping exercise program on time to stabilization and perceived stability in athletes with chronic ankle instability. Methods: Twenty-eight basketball players with chronic ankle instability (mean ± SD age: 22.67 ± 2.88 years; weight: 80.47 ± 8.48 kg; height: 186.82 ± 3.09 cm) participated in this...

Safe Online Learning Using Barrier Functions

We present a method for guaranteeing the safety of online learning schemes. The method uses barrier certificates and Sums-of-Squares programming to find a safe region of state space and a controller which renders that space positively invariant. This safe set and controller are then used to create "training wheels", which can be added to any controller to create a guaranteed safe controller. Th...

Journal:
  • CoRR

Volume abs/1709.10273

Publication date 2017